ON RESONANCE EXTRACTION AND WAVEFORM FITTING
FOR TRANSIENT DATA; PRONY'S METHOD


INTRODUCTION 

The  estimation  of  the  resonances  (natural  frequencies)  of  a  system,  from 
observation  of  a  noisy  response,  is  an  important  problem  of  frequent  occurrence 
in  practical  situations.  Usually,  the  number  of  observations  is  considerably 
greater  than  the  number  of  resonances,  and  the  task  of  utilizing  these  "extra" 
data to reduce the errors of estimation must be accomplished without an excessive
amount of computational effort or trial-and-error. Accordingly, the
original  exact-fit  procedure  by  Prony  has  to  be  generalized  to  a  least-squares 
approach.  In  this  manner,  the  amount  of  data  processing  is  minimized,  with 
all  the  nonlinear  processing  being  concentrated  in  the  solution  for  the  roots 
of  a  polynomial. 

The  purpose  of  this  report  is  to  develop  and  explain  this  least-squares 
solution  and  to  show  its  close  connection  to  linear  prediction.  The  first 
section, on Mathematical Details, sets up the problem definition and introduces
the terms necessary to interpret recent work by Auton and Van Blaricum [1]
described in the next section. Some important points about the waveform-fitting
technique are explained, and some possible alternative approaches are
mentioned. A more general model is considered in appendix A, and a generalization
to linear prediction is developed in appendix B, which subsumes forward
prediction, backward prediction, and a weighted linear combination in general.

MATHEMATICAL DETAILS


IDEAL  EXPONENTIAL  MODEL 


Suppose a sequence $\{g_m\}_0^{N-1}$ of N points is given exactly by the model*

$$g_m = \sum_{k=1}^{n} C_k \mu_k^m \quad \text{for } 0 \le m \le N-1. \tag{1}$$

That is, sequence $\{g_m\}_0^{N-1}$ is a sum of n complex exponentials. Without loss
of generality, we presume that all the $\{C_k\}$ are nonzero for $1 \le k \le n$.


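Consider the error (in linear prediction) of attempting to represent $g_m$
in terms of its past n values; that is, for $n \le m \le N-1$, consider linear
prediction error (where $\alpha_0 \equiv -1$)

$$\begin{aligned}
g_m - \sum_{j=1}^{n} \alpha_j g_{m-j}
 &= -\sum_{j=0}^{n} \alpha_j g_{m-j}
  = -\sum_{j=0}^{n} \alpha_j \sum_{k=1}^{n} C_k \mu_k^{m-j} \\
 &= -\sum_{k=1}^{n} C_k \mu_k^{m-n} \sum_{j=0}^{n} \alpha_j \mu_k^{n-j} \\
 &= \sum_{k=1}^{n} C_k \mu_k^{m-n}
    \left[ \mu_k^{n} - \alpha_1 \mu_k^{n-1} - \cdots - \alpha_{n-1} \mu_k - \alpha_n \right],
\end{aligned} \tag{2}$$

where we substituted (1) and interchanged summations. Now we choose the n
linear coefficients $\{\alpha_j\}_1^n$ such that

$$\mu_k^{n} - \alpha_1 \mu_k^{n-1} - \cdots - \alpha_{n-1} \mu_k - \alpha_n = 0 \quad \text{for } 1 \le k \le n. \tag{3}$$

This requires solution of n linear equations for the n unknowns $\{\alpha_j\}_1^n$,
presuming that the n quantities $\{\mu_k\}_1^n$ are known. In fact, the general
solution is

$$\alpha_j = (-1)^{j-1} \, (\text{sum of all possible products of } j \text{ different } \mu\text{'s}) \quad \text{for } 1 \le j \le n; \tag{4a}$$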


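that is,

$$\begin{aligned}
\alpha_1 &= \mu_1 + \mu_2 + \cdots + \mu_n \\
\alpha_2 &= -(\mu_1\mu_2 + \mu_1\mu_3 + \cdots + \mu_1\mu_n + \mu_2\mu_3 + \mu_2\mu_4 + \cdots + \mu_{n-1}\mu_n) \\
 &\;\;\vdots \\
\alpha_n &= (-1)^{n-1} \mu_1 \mu_2 \cdots \mu_n.
\end{aligned} \tag{4b}$$

With this choice of $\{\alpha_j\}_1^n$, (2) and (3) yield

$$g_m - \sum_{j=1}^{n} \alpha_j g_{m-j} = 0 \quad \text{for } n \le m \le N-1, \tag{5a}$$

or

$$g_m = \sum_{j=1}^{n} \alpha_j g_{m-j} \quad \text{for } n \le m \le N-1. \tag{5b}$$

*This can be generalized to include terms like $C\mu^m + Dm\mu^m$; see appendix A.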


That is, when sequence $\{g_m\}_0^{N-1}$ is generated as a sum of n complex exponentials
according to (1), the sequence value $g_m$ can be determined exactly as a forward
linear combination of the previous n values, provided that $n \le m \le N-1$. The
restriction of m to this range is due to the fact that $g_m$ is presumed unknown
for m < 0 and for m > N-1; thus only the "valid," or available, data are
employed in (2) and (5b).

It is important to observe that the n linear predictive coefficients $\{\alpha_j\}_1^n$
in (4b) depend on $\{\mu_k\}_1^n$ but are completely independent of the values of the
exponential strengths, or "residues," $\{C_k\}_1^n$ in (1). Also, if the $\{\alpha_j\}_1^n$ were
known instead of the $\{\mu_k\}_1^n$, then (3) can be solved for the $\{\mu_k\}_1^n$ as the n roots
of an n-th order polynomial.
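As a numerical check of (4a) and (5b), the following minimal sketch builds the coefficients from a set of roots and confirms that the prediction error vanishes for two unrelated choices of residues. The function names and the sample values of the roots and residues are illustrative assumptions, not taken from the report.

```python
from itertools import combinations
from math import prod

def prediction_coefficients(mu):
    """alpha_j = (-1)**(j-1) * (sum of products of j different mu's), as in (4a)."""
    n = len(mu)
    return [(-1) ** (j - 1) * sum(prod(c) for c in combinations(mu, j))
            for j in range(1, n + 1)]

def exponential_sequence(C, mu, N):
    """g_m = sum_k C_k * mu_k**m for 0 <= m <= N-1, as in (1)."""
    return [sum(c * u ** m for c, u in zip(C, mu)) for m in range(N)]

def max_prediction_error(g, alpha):
    """Largest |g_m - sum_j alpha_j g_{m-j}| over n <= m <= N-1, as in (5a)."""
    n = len(alpha)
    return max(abs(g[m] - sum(alpha[j - 1] * g[m - j] for j in range(1, n + 1)))
               for m in range(n, len(g)))

# Hypothetical roots: one complex-conjugate pair and one real root.
mu = [0.54 + 0.72j, 0.54 - 0.72j, 0.5]
alpha = prediction_coefficients(mu)

# Two unrelated residue sets; the same alpha predicts both sequences.
errors = [max_prediction_error(exponential_sequence(C, mu, 12), alpha)
          for C in ([1 + 2j, 1 - 2j, 3.0], [0.1, -4.0, 2.5j])]
print(alpha)
print(errors)
```

Both errors are at rounding level, which illustrates that the coefficients depend on the roots only and not on the residues.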

A  more  general  approach  to  linear  prediction  is  developed  in  appendix  B. 

It  subsumes  the  forward  prediction  (given  above),  backward  prediction,  and  a 
weighted  linear  combination  in  general. 


ACTUAL  MEASURED  DATA 

Now suppose that some arbitrary data sequence $\{f_m\}_0^{N-1}$ has been measured
or is available, and we want to choose the 2n parameters in the exponential
model (1) such that the error of representing data $\{f_m\}_0^{N-1}$ by this model is
minimized in some sense. Guided by (5b), we first let linearly predicted
value

$$\hat{f}_m \equiv \sum_{j=1}^{n} \alpha_j f_{m-j} \quad \text{for } n \le m \le N-1, \tag{6}$$

where the linear coefficients $\{\alpha_j\}_1^n$ are to be selected. In particular, we
define the prediction error sequence (called the equation error in [1])

$$\hat{e}_m \equiv f_m - \hat{f}_m = f_m - \sum_{j=1}^{n} \alpha_j f_{m-j} \quad \text{for } n \le m \le N-1. \tag{7}$$

This is also called Prony's difference equation. We then define the total
squared prediction error as*

$$\hat{E} \equiv \sum_{m=n}^{N-1} \hat{w}_m \hat{e}_m^2 = \sum_{m=n}^{N-1} \hat{w}_m \left( f_m - \sum_{j=1}^{n} \alpha_j f_{m-j} \right)^2, \tag{8}$$

where $\{\hat{w}_m\}_n^{N-1}$ are a set of N-n positive weights. $\hat{E}$ is called the quadratic
error in [1].

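Minimization of total squared prediction error $\hat{E}$ by choice of coefficients
$\{\alpha_j\}_1^n$ is accomplished by setting

$$\frac{\partial \hat{E}}{\partial \alpha_k} = 0 \quad \text{for } 1 \le k \le n. \tag{9}$$

This results in n linear equations in the n unknowns $\{\alpha_j\}_1^n$. We solve these
equations for the $\{\alpha_j\}_1^n$ that minimize prediction error $\hat{E}$.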
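Written out, condition (9) gives the normal equations $\sum_j \alpha_j R_{kj} = r_k$, with $R_{kj} = \sum_m \hat{w}_m f_{m-k} f_{m-j}$ and $r_k = \sum_m \hat{w}_m f_{m-k} f_m$, the sums running over $n \le m \le N-1$. The sketch below solves them with a small Gaussian elimination; the helper names and the noise-free two-exponential test data are assumptions made for illustration only.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_prediction_coefficients(f, n, w=None):
    """Least-squares alpha_1..alpha_n from the normal equations implied by (9)."""
    N = len(f)
    w = w or [1.0] * (N - n)          # w[m - n] is the weight for sample m
    R = [[sum(w[m - n] * f[m - k] * f[m - j] for m in range(n, N))
          for j in range(1, n + 1)] for k in range(1, n + 1)]
    r = [sum(w[m - n] * f[m - k] * f[m] for m in range(n, N))
         for k in range(1, n + 1)]
    return solve_linear(R, r)

# Hypothetical noise-free data with roots 0.9 and 0.5,
# so (4b) predicts alpha = (0.9 + 0.5, -0.9 * 0.5).
f = [2.0 * 0.9 ** m + 1.0 * 0.5 ** m for m in range(20)]
alpha = fit_prediction_coefficients(f, 2)
print(alpha)
```

With noisy data the same call returns the least-squares compromise rather than the exact coefficients.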

We must point out an alternative approach to the minimization of $\hat{E}$.
One could instead minimize the Chebyshev error; that is, we could choose the
$\{\alpha_j\}_1^n$ in (7) so as to minimize the quantity

$$\max_{n \le m \le N-1} \left| f_m - \sum_{j=1}^{n} \alpha_j f_{m-j} \right|. \tag{10}$$

That is, the maximum error in prediction is minimized. Although this approach
yields nonlinear equations in the $\{\alpha_j\}_1^n$, efficient linear programming
techniques exist for this problem. How well this minimax error criterion compares
with the total squared error criterion is not known.

*We are presuming real data sequences here; generalization to complex data is
possible.


Given the values for $\{\alpha_j\}_1^n$, whether obtained via (9) or (10), we can now
solve (3) for the $\{\mu_k\}_1^n$. Some of these latter values may be complex, even
though all the $\{\alpha_j\}_1^n$ are real for real data $\{f_m\}_0^{N-1}$; this situation is
treated in [2], p. 380.
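The root-finding step can be sketched as follows. The report does not prescribe a root finder; the Durand-Kerner iteration used here is one simple choice, swapped in for illustration, and the starting points, iteration count, and sample coefficients are assumptions.

```python
def polynomial_roots(alpha, iterations=200):
    """Roots of z**n - alpha_1 z**(n-1) - ... - alpha_n = 0, i.e. polynomial (3),
    found by Durand-Kerner iteration (assumes distinct roots)."""
    n = len(alpha)
    coeffs = [1.0] + [-a for a in alpha]         # monic, highest power first

    def p(z):
        value = 0j
        for c in coeffs:                          # Horner evaluation
            value = value * z + c
        return value

    z = [(0.4 + 0.9j) ** k for k in range(n)]     # customary distinct starting points
    for _ in range(iterations):
        for i in range(n):
            denom = 1 + 0j
            for j in range(n):
                if j != i:
                    denom *= z[i] - z[j]
            z[i] -= p(z[i]) / denom
    return z

# Real coefficients (hypothetical values) whose roots include a complex pair.
alpha = [1.58, -1.35, 0.405]
roots = sorted(polynomial_roots(alpha), key=lambda z: (round(z.real, 6), z.imag))
print(roots)
```

Although every coefficient is real, two of the three roots form a complex-conjugate pair, which is the situation described in the paragraph above.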

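Guided now by (1), we next let model data value*

$$\tilde{f}_m \equiv \sum_{k=1}^{n} C_k \mu_k^m \quad \text{for } 0 \le m \le N-1. \tag{11}$$

Then we define data error sequence (called the true error in [1])

$$\tilde{e}_m \equiv f_m - \tilde{f}_m = f_m - \sum_{k=1}^{n} C_k \mu_k^m \quad \text{for } 0 \le m \le N-1. \tag{12}$$

In a similar fashion to (8), we also define the total squared data error as

$$\tilde{E} \equiv \sum_{m=0}^{N-1} \tilde{w}_m \tilde{e}_m^2 = \sum_{m=0}^{N-1} \tilde{w}_m \left( f_m - \sum_{k=1}^{n} C_k \mu_k^m \right)^2, \tag{13}$$

where $\{\tilde{w}_m\}_0^{N-1}$ are a set of N positive weights. To minimize total error $\tilde{E}$,
we set

$$\frac{\partial \tilde{E}}{\partial C_j} = 0 \quad \text{for } 1 \le j \le n, \tag{14}$$

thereby obtaining n linear equations in the n unknowns $\{C_k\}_1^n$. (The quantities
$\{\mu_k\}_1^n$ are already known at this point; see the discussion preceding (11).)
We solve these n equations for the $\{C_k\}_1^n$ that minimize $\tilde{E}$.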
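Condition (14) leads to the linear system $\sum_k G_{jk} C_k = h_j$, with $G_{jk} = \sum_m \tilde{w}_m \mu_j^m \mu_k^m$ and $h_j = \sum_m \tilde{w}_m \mu_j^m f_m$. The sketch below is one way to solve it for real data whose complex roots occur in conjugate pairs; the helper names and the synthetic data are assumptions for illustration.

```python
def solve_linear(A, b):
    """Solve A x = b (real or complex) by Gaussian elimination with pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_residues(f, mu, w=None):
    """Residues C_k minimizing the total squared data error (13) for known mu_k."""
    N, n = len(f), len(mu)
    w = w or [1.0] * N
    powers = [[u ** m for m in range(N)] for u in mu]   # powers[k][m] = mu_k**m
    G = [[sum(w[m] * powers[j][m] * powers[k][m] for m in range(N))
          for k in range(n)] for j in range(n)]
    h = [sum(w[m] * powers[j][m] * f[m] for m in range(N)) for j in range(n)]
    return solve_linear(G, h)

# Hypothetical real data built from a conjugate pair plus one real exponential.
mu = [0.54 + 0.72j, 0.54 - 0.72j, 0.5]
C_true = [1 + 2j, 1 - 2j, 3.0]
f = [sum(c * u ** m for c, u in zip(C_true, mu)).real for m in range(20)]
C_fit = fit_residues(f, mu)
print(C_fit)
```

For noise-free data the fitted residues reproduce the generating values; the residues of a conjugate pair of roots come out as conjugates of each other.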

An alternative approach to the minimization of $\tilde{E}$ is to minimize the
Chebyshev error; that is, choose the $\{C_k\}_1^n$ in (12) so as to minimize the
quantity

$$\max_{0 \le m \le N-1} \left| f_m - \sum_{k=1}^{n} C_k \mu_k^m \right|. \tag{15}$$

Again, the performance quality of (10) and (15) is not known.

*This presumes that all the roots $\{\mu_k\}_1^n$ are distinct; if on the other hand,
we had, for example, $\mu_1 = \mu_2$, then we need $C_1\mu_1^m + C_2 m\mu_1^m$ rather than
$C_1\mu_1^m + C_2\mu_2^m$.

At this point, we have a "fitted" waveform,

$$\sum_{k=1}^{n} C_k \mu_k^m \quad \text{for } 0 \le m \le N-1, \tag{16}$$

to the original given data sequence $\{f_m\}_0^{N-1}$. However, it should be observed
that the fit was obtained via a two-stage sequential procedure. Namely, we
first minimized the total prediction error of (8) to find the linear coefficients
$\{\alpha_j\}_1^n$ and from them, solved the polynomial of (3) for its roots $\{\mu_k\}_1^n$.
(These latter quantities are called the resonances in [1].) Then, with
these known values for $\{\mu_k\}_1^n$, the total data error of (13) was minimized, thereby
determining the strengths (residues) of each of the known exponential
components.

Both error definitions, (7)-(8) and (12)-(13), utilize and "fit" the
available data sequence $\{f_m\}_0^{N-1}$, but in two different senses, the first via
linear prediction, and the second via an exponential model. The worst nonlinear
data processing encountered in this two-stage procedure is the
solution of an n-th order polynomial, (3), for all its roots $\{\mu_k\}_1^n$. This
sequential procedure will not realize as small an error as direct minimization
of

$$\sum_{m=0}^{N-1} \tilde{w}_m \left( f_m - \sum_{k=1}^{n} C_k \mu_k^m \right)^2 \tag{17}$$

via simultaneous choice of $\{C_k\}_1^n$ and $\{\mu_k\}_1^n$. However, this latter approach
is highly nonlinear in the $\{\mu_k\}_1^n$, and no direct (nonrecursive) solution is
known. Of course, a gradient search on (17) could be employed, using as
starting values, those obtained above via the two-stage sequential procedure.

SOME RECENT WORK

The source of the following results and comments is the work by Auton
and Van Blaricum [1]. The solution for the coefficients $\{\alpha_j\}_1^n$ in (9) is
called the reduced or inhomogeneous solution; see [1], vol. I, p. 2-5.
This traditional solution, unfortunately, tends to zero as the white
(independent) noise component in $\{f_m\}_0^{N-1}$ gets larger. A remedy to this
undesired behavior is furnished by employing instead, the weakest eigenvector
of the matrix $Q^TQ$, where Q is the data matrix formed by arranging
the given data $\{f_m\}_0^{N-1}$ in columns in a particular fashion; see [1], vol. I,
p. 2-2. (An equivalent interpretation is that $Q^TQ$ or Q are approximated by
matrices of lower rank, i.e., singular matrices.) It has been found that
the weakest eigenvector of $Q^TQ$ is less dependent on the absolute noise level
and can furnish more useful values for the resonances $\{\mu_k\}_1^n$ than can the
inhomogeneous solution. Physically, the "best" linear prediction of a noisy
waveform tends to zero, whereas an eigenvector can maintain all its components
nonzero, regardless of the absolute noise level. At present, the
weakest eigenvector solution is judged to be the best of all iterative and
noniterative methods for estimating the resonances $\{\mu_k\}_1^n$; see [1], vol. I,
p. 2-28.
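A heuristic way to see this behavior is to note that zero-mean white noise of variance $\sigma^2$, independent of the signal, adds on average $\sigma^2 (N-n)$ times the identity to the Gram matrix of the data. The sketch below applies that averaged effect to a single noise-free exponential (n = 1) and compares the two solutions. This is an illustration under that averaging assumption, not the analysis of [1]; the function name and numerical values are hypothetical.

```python
import math

N, mu = 30, 0.9
signal = [mu ** m for m in range(N)]

# Noise-free Gram matrix for n = 1; row m of the data matrix is [f_m, f_{m-1}].
s00 = sum(signal[m] ** 2 for m in range(1, N))
s01 = sum(signal[m] * signal[m - 1] for m in range(1, N))
s11 = sum(signal[m - 1] ** 2 for m in range(1, N))

def estimates(sigma):
    """Return (inhomogeneous alpha_1, eigenvector-based mu) for noise std sigma."""
    shift = sigma ** 2 * (N - 1)            # averaged white-noise contribution
    t00, t11 = s00 + shift, s11 + shift
    alpha_ls = s01 / t11                     # least-squares forward predictor
    lam0 = 0.5 * (t00 + t11) - math.hypot(0.5 * (t00 - t11), s01)
    mu_eig = (t00 - lam0) / s01              # from the eigenvector (s01, lam0 - t00)
    return alpha_ls, mu_eig

results = [estimates(sigma) for sigma in (0.0, 0.1, 0.3, 1.0)]
for alpha_ls, mu_eig in results:
    print(round(alpha_ls, 4), round(mu_eig, 4))
```

In this averaged picture the inhomogeneous coefficient shrinks toward zero as the noise level grows, while the eigenvector is unchanged because adding a multiple of the identity shifts the eigenvalues only. With an actual noise realization both estimates fluctuate, so the sketch indicates the trend rather than a guarantee.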

When the number of resonances, n, in (1) is unknown, its determination
or estimation must be made from the available data $\{f_m\}_0^{N-1}$. If k is the
true (unknown) number of resonances, and n is the hypothesized number,
there  are  n-k  extraneous  resonance  estimates  produced.  A  maximum  likelihood 
procedure  developed  in  [1]  and  applied  to  the  z  smallest  eigenvalues  (for 
various  values  of  z)  has  been  found  to  give  reasonable  estimates  of  k.  An 
alternative  approach,  employing  time  reversal  of  the  data  sequence,  seems 
to  separate  extraneous  resonances,  but  more  study  is  suggested;  see  [1],  vol.  I, 
p. 3-26.

CONCLUSIONS

The usual problems associated with Prony's method, regarding sensitivity
to noise, have been attributed to dense sampling and bias. If both of these
problems are treated properly and the weakest eigenvector is employed,
Prony's method produces excellent estimates of the resonances, even from data
with high noise levels; see [1], vol. I, p. 4-8.

Studies on some of these still-unanswered questions about alternative
procedures for order selection and resonance estimation will continue.
Certainly, further improvements in the procedures and performance will ensue.
Applications to real measured data have yet to be made, however; see [1],
vol. I, pp. 5-2 and 5-3.

Appendix A
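A MORE GENERAL MODEL

Instead of (1) of the main text, suppose that sequence value

$$g_m = \sum_{k=1}^{n} C_k \mu_k^m + \sum_{k=1}^{p} D_k m \mu_k^m \quad \text{for } 0 \le m \le N-1, \tag{A.1}$$

where p can be larger or smaller than n. Then for $n+p \le m \le N-1$, consider
linear prediction error (where $\alpha_0 \equiv -1$)

$$\begin{aligned}
g_m &- \sum_{j=1}^{n} \alpha_j g_{m-j} - \sum_{j=1}^{p} \beta_j g_{m-n-j} \\
 &= -\sum_{j=0}^{n} \alpha_j \left[ \sum_{k=1}^{n} C_k \mu_k^{m-j} + \sum_{k=1}^{p} D_k (m-j) \mu_k^{m-j} \right]
    - \sum_{j=1}^{p} \beta_j \left[ \sum_{k=1}^{n} C_k \mu_k^{m-n-j} + \sum_{k=1}^{p} D_k (m-n-j) \mu_k^{m-n-j} \right] \\
 &= -\sum_{k=1}^{n} C_k \mu_k^{m} \left[ \sum_{j=0}^{n} \alpha_j \mu_k^{-j} + \sum_{j=1}^{p} \beta_j \mu_k^{-n-j} \right]
    - \sum_{k=1}^{p} D_k \mu_k^{m} \left[ \sum_{j=0}^{n} \alpha_j (m-j) \mu_k^{-j} + \sum_{j=1}^{p} \beta_j (m-n-j) \mu_k^{-n-j} \right].
\end{aligned} \tag{A.2}$$

The quantities in brackets can be made zero for $n+p \le m \le N-1$, by setting
both

$$\sum_{j=0}^{n} \alpha_j \mu_k^{-j} + \sum_{j=1}^{p} \beta_j \mu_k^{-n-j} = 0 \quad \text{for } 1 \le k \le n \tag{A.3}$$

and

$$\sum_{j=0}^{n} \alpha_j (m-j) \mu_k^{-j} + \sum_{j=1}^{p} \beta_j (m-n-j) \mu_k^{-n-j} = 0 \quad \text{for } 1 \le k \le p. \tag{A.4}$$

This combination constitutes n+p linear equations in the n+p unknowns $\{\alpha_j\}_1^n$
and $\{\beta_j\}_1^p$; $\alpha_0 = -1$.

These equations can be put in the form

$$\alpha_0 \mu_k^{n+p} + \alpha_1 \mu_k^{n+p-1} + \cdots + \alpha_n \mu_k^{p} + \beta_1 \mu_k^{p-1} + \cdots + \beta_p = 0 \quad \text{for } 1 \le k \le n, \tag{A.5}$$

$$\alpha_1 \mu_k^{n+p-1} + \cdots + \alpha_n n \mu_k^{p} + \beta_1 (n+1) \mu_k^{p-1} + \cdots + \beta_p (n+p) = 0 \quad \text{for } 1 \le k \le p. \tag{A.6}$$

So sequence value $g_m$ can be determined exactly as a linear combination of
its previous n+p values, for $n+p \le m \le N-1$. Notice that coefficients $\{\alpha_j\}_1^n$
and $\{\beta_j\}_1^p$ depend on $\{\mu_k\}_1^q$ (where $q \equiv \max(n,p)$), but not on strengths $\{C_k\}_1^n$
or $\{D_k\}_1^p$. See also [3], pp. 174-175.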
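For the simplest case n = p = 1, equations (A.5) and (A.6) reduce to $-\mu^2 + \alpha_1\mu + \beta_1 = 0$ and $\alpha_1\mu + 2\beta_1 = 0$, whose solution is $\alpha_1 = 2\mu$, $\beta_1 = -\mu^2$. The short check below, with hypothetical values for $\mu$, C, and D, confirms that these two coefficients predict the sequence exactly from its previous two values.

```python
# Hypothetical parameters for g_m = C*mu**m + D*m*mu**m (n = p = 1 in (A.1)).
mu, C, D, N = 0.8, 2.0, 0.5, 15
g = [C * mu ** m + D * m * mu ** m for m in range(N)]

# Solution of (A.5) and (A.6) for n = p = 1 with alpha_0 = -1.
alpha1, beta1 = 2 * mu, -mu ** 2

# Residuals of the two defining equations.
eq_a5 = -mu ** 2 + alpha1 * mu + beta1
eq_a6 = alpha1 * mu + 2 * beta1

# Prediction error g_m - alpha_1 g_{m-1} - beta_1 g_{m-2} over 2 <= m <= N-1.
max_err = max(abs(g[m] - alpha1 * g[m - 1] - beta1 * g[m - 2])
              for m in range(2, N))
print(eq_a5, eq_a6, max_err)
```

The coefficients are those of $(z-\mu)^2$, so a term of the form $m\mu^m$ behaves like a double root of the prediction polynomial.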
Appendix B

EIGENVECTOR  GENERALIZATION  OF  LINEAR  PREDICTION 


IDEAL  MODEL 

The starting point is again (1) of the main text. We now generalize (2)
of the main text to the form

$$e_m \equiv \sum_{j=0}^{n} \alpha_j g_{m-j} \quad \text{for } n \le m \le N-1, \tag{B.1}$$

where all the $\{\alpha_j\}_0^n$ are arbitrary for the moment. It follows, from
substitution of (1) of the main text in (B.1), that

$$\begin{aligned}
e_m &= \sum_{j=0}^{n} \alpha_j \sum_{k=1}^{n} C_k \mu_k^{m-j}
     = \sum_{k=1}^{n} C_k \mu_k^{m} \sum_{j=0}^{n} \alpha_j \mu_k^{-j} \\
    &= \sum_{k=1}^{n} C_k \mu_k^{m-n} \sum_{j=0}^{n} \alpha_j \mu_k^{n-j} \quad \text{for } n \le m \le N-1.
\end{aligned} \tag{B.2}$$

Now let us set

$$\sum_{j=0}^{n} \alpha_j \mu_k^{n-j} = \alpha_0 \mu_k^{n} + \cdots + \alpha_{n-1} \mu_k + \alpha_n = 0 \quad \text{for } 1 \le k \le n, \tag{B.3}$$

by choice of $\{\alpha_j\}_0^n$. Since there are only n equations in (B.3), but n+1
unknowns, we will not get a unique solution for the $\{\alpha_j\}_0^n$ unless we restrict
them somehow. Also, we must disallow the zero solution.

Observe that if we had used only n coefficients $\{\alpha_j\}_0^{n-1}$ in (B.1), we
would have obtained, instead of (B.3), n equations in n unknowns. However,
the only solution to these equations is the zero solution $\alpha_j = 0$ for all j,
which is useless.

Before we consider the restriction on $\{\alpha_j\}_0^n$, observe that substituting
(B.3) in (B.2) yields

$$e_m = \sum_{j=0}^{n} \alpha_j g_{m-j} = 0 \quad \text{for } n \le m \le N-1. \tag{B.4}$$

That is, we can find an infinite number of linear combinations of n+1 adjacent
values of sequence $\{g_m\}_0^{N-1}$ generated via (1) of the main text,
which are identically zero for all possible locations of the (n+1)-long
average within the record of length N.

Now to get back to the solution of (B.3) for the coefficients $\{\alpha_j\}_0^n$, we
observe that the linear predictive approach considered in (2) et seq.
of the main text amounts to choosing $\alpha_0 = -1$; this results in a unique
solution for the n linear equations (B.3) in the remaining n unknowns $\{\alpha_j\}_1^n$
and is called forward prediction by virtue of form (5b) of the main text.

An obvious alternative would be to select $\alpha_n = -1$, in which case (B.3) and
(B.4) would yield a unique solution for $\{\alpha_j\}_0^{n-1}$, and

$$g_{m-n} = \alpha_0 g_m + \cdots + \alpha_{n-1} g_{m-n+1} \quad \text{for } n \le m \le N-1. \tag{B.5}$$

That is, we are doing backward linear prediction to obtain the sequence
values. But observe that both of these cases are specializations of the
linear constraint

$$C^T A = 1 \tag{B.6}$$

on the coefficients $\{\alpha_j\}_0^n$, where

$$C \equiv [c_0, c_1, \ldots, c_n]^T, \qquad A \equiv [\alpha_0, \alpha_1, \ldots, \alpha_n]^T \tag{B.7}$$

are column matrices. Constraint (B.6) prevents the zero solution, and when
combined with (B.3), gives a unique solution for A. We can normalize the
matrix of constants, C, such that

$$C^T C = 1 \quad (\text{or } K \text{ if desired}), \tag{B.8}$$

without loss of generality. Forward or backward prediction, respectively,
corresponds to choosing all the $\{c_j\}_0^n$ equal to zero except for edge elements
$c_0$ or $c_n$, respectively, equal to -1. So, generally, we can realize the linear
combination

$$\sum_{j=0}^{n} \alpha_j g_{m-j} = 0 \quad \text{for } n \le m \le N-1, \tag{B.9}$$

subject to $\{\alpha_j\}_0^n$ satisfying the linear constraint (B.6), which guarantees a
nonzero solution. C is any vector satisfying (B.8).


ACTUAL  MEASURED  DATA 


Now consider that measured data $\{f_m\}_0^{N-1}$ are available. Instead of
linear prediction (6) of the main text, consider the more general linear
combination (as in (B.1))

$$d_m \equiv \sum_{j=0}^{n} \alpha_j f_{m-j} \quad \text{for } n \le m \le N-1, \tag{B.10}$$

where set $\{\alpha_j\}_0^n$ is not yet specified. Define error and data matrices

$$D \equiv \begin{bmatrix} d_n \\ d_{n+1} \\ \vdots \\ d_{N-1} \end{bmatrix}, \qquad
F \equiv \begin{bmatrix}
f_n & f_{n-1} & \cdots & f_0 \\
f_{n+1} & f_n & \cdots & f_1 \\
\vdots & \vdots & & \vdots \\
f_{N-1} & f_{N-2} & \cdots & f_{N-1-n}
\end{bmatrix} \quad (N-n) \times (n+1). \tag{B.11}$$

Then (B.10) can be expressed as

$$D = FA, \tag{B.12}$$

where we used (B.7).

Now we want to minimize the total quadratic error of (B.10), namely,

$$\sum_{m=n}^{N-1} d_m^2 = D^T D = A^T F^T F A, \tag{B.13}$$

by selection of A, but subject to linear constraint (B.6) on A, which guarantees
a nonzero solution. C is an arbitrary, yet-unspecified matrix.
Accordingly, we use a Lagrange multiplier $2\lambda$ and look for an extremum of

$$A^T S A - 2\lambda C^T A, \tag{B.14}$$

where we have defined

$$S \equiv F^T F, \quad \text{an } (n+1) \times (n+1) \text{ matrix}. \tag{B.15}$$


S is easily seen to be a nonnegative definite matrix; it generally has full
rank when N > 2n. Completing the square in (B.14), we rewrite it as

$$(A - \lambda S^{-1} C)^T S (A - \lambda S^{-1} C) - \lambda^2 C^T S^{-1} C. \tag{B.16}$$

The extremum is then obviously realized for coefficient matrix

$$A_0 = \lambda S^{-1} C. \tag{B.17}$$

To evaluate $\lambda$, we have to satisfy the linear constraint (B.6):

$$\lambda C^T S^{-1} C = 1, \qquad \lambda = \frac{1}{C^T S^{-1} C}. \tag{B.18}$$

The best coefficient set is then, from (B.17),

$$A_0 = \frac{S^{-1} C}{C^T S^{-1} C}. \tag{B.19}$$

(Thus the best coefficients are proportional to the first column of $S^{-1}$ for
forward linear prediction, or to the last column for backward linear prediction.)
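The constrained solution (B.19) can be checked numerically in the smallest case, n = 1, where S is 2 x 2 and can be inverted in closed form. In the sketch below the function name is illustrative, and the data are a hypothetical decaying exponential plus a small deterministic perturbation standing in for noise.

```python
import math

def constrained_coefficients(S, C):
    """A0 = S^{-1} C / (C^T S^{-1} C) for a 2x2 symmetric S, as in (B.19).
    Also returns 1 / (C^T S^{-1} C), the resulting total quadratic error."""
    (a, b), (_, d) = S
    det = a * d - b * b
    S_inv = [[d / det, -b / det], [-b / det, a / det]]
    S_inv_C = [S_inv[0][0] * C[0] + S_inv[0][1] * C[1],
               S_inv[1][0] * C[0] + S_inv[1][1] * C[1]]
    denom = C[0] * S_inv_C[0] + C[1] * S_inv_C[1]
    return [x / denom for x in S_inv_C], 1.0 / denom

# Hypothetical data: exponential plus a deterministic perturbation.
f = [0.9 ** m + 0.01 * math.sin(3.0 * m) for m in range(30)]

# S = F^T F for n = 1, where row m of F is [f_m, f_{m-1}] (see (B.11), (B.15)).
s00 = sum(f[m] ** 2 for m in range(1, len(f)))
s01 = sum(f[m] * f[m - 1] for m in range(1, len(f)))
s11 = sum(f[m - 1] ** 2 for m in range(1, len(f)))
S = [[s00, s01], [s01, s11]]

A_fwd, err_fwd = constrained_coefficients(S, [-1.0, 0.0])   # forward prediction
A_bwd, err_bwd = constrained_coefficients(S, [0.0, -1.0])   # backward prediction
quad_fwd = (A_fwd[0] ** 2 * s00 + 2 * A_fwd[0] * A_fwd[1] * s01
            + A_fwd[1] ** 2 * s11)                          # A0^T S A0
print(A_fwd, err_fwd)
print(A_bwd, err_bwd)
```

For the forward constraint the first coefficient comes out as -1 and the second equals the ordinary least-squares predictor s01 / s11, in line with the parenthetical remark above.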
The corresponding minimum value of the total quadratic error, (B.13), is

$$A_0^T S A_0 = \frac{C^T S^{-1} S S^{-1} C}{(C^T S^{-1} C)^2} = \frac{1}{C^T S^{-1} C}. \tag{B.20}$$

(This denominator reduces to the 0,0 element of $S^{-1}$ for forward linear
prediction, or to the n,n element of $S^{-1}$ for backward linear prediction.)

But this result, (B.20), obviously depends on the particular values
assigned to the constraint vector C in (B.6). The question then arises as
to what constraint vector would yield further reduction of error (B.20). To
determine this, let matrix S, defined in (B.15), have eigenvalue matrix

$$\Lambda \equiv \operatorname{diag}(\lambda_0, \lambda_1, \ldots, \lambda_n), \qquad \lambda_0 \le \lambda_1 \le \cdots \le \lambda_n, \tag{B.21}$$

and modal (eigenvector) matrix

$$E \equiv [e_0 \; e_1 \; \cdots \; e_n]. \tag{B.22}$$

Then

$$SE = E\Lambda \tag{B.23}$$

or

$$S e_k = \lambda_k e_k \quad \text{for } 0 \le k \le n. \tag{B.24}$$

By taking the inverse of (B.23), and pre- and post-multiplying by E, we
obtain

$$S^{-1} E = E \Lambda^{-1} \tag{B.25}$$

or

$$S^{-1} e_k = \lambda_k^{-1} e_k \quad \text{for } 0 \le k \le n, \tag{B.26}$$

which we will need below. The inverse matrix has the same eigenvectors but
the inverse eigenvalues of S.

Now any n+1 column matrix can be expressed in terms of the eigenvectors
of S. In particular, suppose we let

$$C = \sum_{k=0}^{n} b_k e_k. \tag{B.27}$$

Recalling normalization (B.8), we have the constraint on the $\{b_k\}_0^n$:

$$\sum_{k,l=0}^{n} b_k b_l \, e_k^T e_l = \sum_{k=0}^{n} b_k^2 = 1, \tag{B.28}$$

since the eigenvectors $\{e_k\}_0^n$ are orthonormal. If we substitute (B.27) in
(B.20), the denominator is given by

$$C^T S^{-1} C = \sum_{k,l=0}^{n} b_k b_l \, e_k^T S^{-1} e_l
 = \sum_{k,l=0}^{n} b_k b_l \, \lambda_l^{-1} e_k^T e_l
 = \sum_{k,l=0}^{n} b_k b_l \, \lambda_l^{-1} \delta_{kl}
 = \sum_{k=0}^{n} b_k^2 / \lambda_k, \tag{B.29}$$

where we employed (B.26) and the orthonormality of the eigenvectors. Now
since we want to minimize (B.20), we must maximize (B.29), but subject to
(B.28). Obviously the best choice of $\{b_k\}_0^n$ is given by

$$b_0 = \pm 1, \qquad b_k = 0 \quad \text{for } 1 \le k \le n, \tag{B.30}$$

where $\lambda_0$ is the smallest eigenvalue of S; see (B.21). Thus

$$\text{Minimum total quadratic error} = \frac{1}{e_0^T S^{-1} e_0} = \lambda_0, \tag{B.31}$$

which is the smallest eigenvalue of S defined in (B.15).

Now we can employ result (B.30) in (B.27) and (B.19) to find the best
coefficient set $A_0$. We have $C = \pm e_0$, and (B.19) becomes

$$A_0 = \pm \frac{S^{-1} e_0}{e_0^T S^{-1} e_0} = \pm \frac{\lambda_0^{-1} e_0}{e_0^T \lambda_0^{-1} e_0} = \pm e_0, \tag{B.32}$$

where we used (B.26). Thus both the constraint vector and the best linear
weighting of the data in (B.10) are equal to the weakest eigenvector of the
matrix $S = F^T F$, where F is the data matrix defined in (B.11).
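For n = 1 the weakest eigenvector of S is available in closed form, which allows a small numerical check of (B.31) and (B.32). The sketch below uses hypothetical data (an exponential plus a deterministic perturbation standing in for noise) and illustrative function names; it compares the smallest eigenvalue with the errors obtained under the forward and backward constraints.

```python
import math

def gram_matrix(f):
    """Entries of S = F^T F for n = 1, row m of F being [f_m, f_{m-1}]."""
    s00 = sum(f[m] ** 2 for m in range(1, len(f)))
    s01 = sum(f[m] * f[m - 1] for m in range(1, len(f)))
    s11 = sum(f[m - 1] ** 2 for m in range(1, len(f)))
    return s00, s01, s11

def weakest_eigenpair(s00, s01, s11):
    """Smallest eigenvalue and unit eigenvector of [[s00, s01], [s01, s11]].
    Assumes s01 != 0."""
    lam0 = 0.5 * (s00 + s11) - math.hypot(0.5 * (s00 - s11), s01)
    v = (s01, lam0 - s00)                  # solves (S - lam0 I) v = 0
    norm = math.hypot(*v)
    return lam0, (v[0] / norm, v[1] / norm)

f = [0.9 ** m + 0.01 * math.sin(3.0 * m) for m in range(30)]
s00, s01, s11 = gram_matrix(f)
lam0, (a0, a1) = weakest_eigenpair(s00, s01, s11)

mu_estimate = -a1 / a0                     # root of a0*mu + a1 = 0, from (B.3)
quad = a0 ** 2 * s00 + 2 * a0 * a1 * s01 + a1 ** 2 * s11   # A^T S A at A = e_0

det = s00 * s11 - s01 ** 2
forward_error = det / s11                  # 1 / (0,0 element of S^{-1})
backward_error = det / s00                 # 1 / (1,1 element of S^{-1})
print(lam0, forward_error, backward_error, mu_estimate)
```

The smallest eigenvalue is no larger than either constrained error, as (B.31) requires, and the resonance estimate follows from the eigenvector components through (B.3).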

We can now return to (B.3) to solve for the $\{\mu_k\}_1^n$, where we use the
components of the weakest eigenvector of S for the $\{\alpha_j\}_0^n$; that is, we use

$$\begin{bmatrix} \alpha_0 \\ \alpha_1 \\ \vdots \\ \alpha_n \end{bmatrix}
 = \pm \begin{bmatrix} e_{00} \\ e_{01} \\ \vdots \\ e_{0n} \end{bmatrix}, \tag{B.33}$$

where $e_{0j}$ denotes the j-th component of $e_0$.

What we have done is to find the best linear constraint (B.6) for which the total
quadratic error (B.13) is minimized. The end result is the same as if we
had minimized (B.13) directly, subject only to constraint

$$A^T A = \sum_{j=0}^{n} \alpha_j^2 = 1. \tag{B.34}$$

This latter interpretation corresponds to the best A vector in (n+1)-space,
with its tip on the unit sphere, that minimizes the total quadratic error.